AITopics | newton-type method

Safe and Sparse Newton Method for Entropic-Regularized Optimal Transport

Neural Information Processing Systems, Mar-22-2026, 19:10:42 GMT

Computational optimal transport (OT) has received massive interests in the machine learning community, and great advances have been gained in the direction of entropic-regularized OT. The Sinkhorn algorithm, as well as its many improved versions, has become the de facto solution to large-scale OT problems. However, most of the existing methods behave like first-order methods, which typically require a large number of iterations to converge. More recently, Newton-type methods using sparsified Hessian matrices have demonstrated promising results on OT computation, but there still remain a lot of unresolved open questions. In this article, we make major new progresses towards this direction: first, we propose a novel Hessian sparsification scheme that promises a strict control of the approximation error; second, based on this sparsification scheme, we develop a safe Newton-type method that is guaranteed to avoid singularity in computing the search directions; third, the developed algorithm has a clear implementation for practical use, avoiding most hyperparameter tuning; and remarkably, we provide rigorous global and local convergence analysis of the proposed algorithm, which is lacking in the prior literature. Various numerical experiments are conducted to demonstrate the effectiveness of the proposed algorithm in solving large-scale OT problems.

artificial intelligence, machine learning, proceedings, (6 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.77)

GIANT: Globally Improved Approximate Newton Method for Distributed Optimization

Neural Information Processing Systems, Mar-17-2026, 01:36:50 GMT

For distributed computing environment, we consider the empirical risk minimization problem and propose a distributed and communication-efficient Newton-type optimization method. At every iteration, each worker locally finds an Approximate NewTon (ANT) direction, which is sent to the main driver. The main driver, then, averages all the ANT directions received from workers to form a Globally Improved ANT (GIANT) direction. GIANT is highly communication efficient and naturally exploits the trade-offs between local computations and global communications in that more local computations result in fewer overall rounds of communications. Theoretically, we show that GIANT enjoys an improved convergence rate as compared with first-order methods and existing distributed Newton-type methods. Further, and in sharp contrast with many existing distributed Newton-type methods, as well as popular first-order methods, a highly advantageous practical feature of GIANT is that it only involves one tuning parameter. We conduct large-scale experiments on a computer cluster and, empirically, demonstrate the superior performance of GIANT.
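The worker/driver exchange described in this abstract can be sketched in a few lines. The following is a minimal single-process illustration, not the authors' implementation. It assumes a ridge-regression objective, simulates workers as data shards in a list, and solves each local Newton system exactly with `np.linalg.solve`, where a real system would use an approximate solver. The names `giant_step` and `demo` are hypothetical.

```python
import numpy as np

def giant_step(w, shards, lam):
    """One GIANT-style iteration for ridge regression (illustrative sketch)."""
    n = sum(X.shape[0] for X, _ in shards)
    # Round 1: workers report local gradient sums; the driver forms the global gradient.
    g = sum(X.T @ (X @ w - y) for X, y in shards) / n + lam * w
    # Round 2: each worker solves with its own local Hessian to get a local direction.
    directions = []
    for X, _ in shards:
        H_local = X.T @ X / X.shape[0] + lam * np.eye(w.size)
        directions.append(np.linalg.solve(H_local, g))
    # The driver averages the local directions into one global direction.
    p = np.mean(directions, axis=0)
    return w - p

def demo(seed=0):
    """Run a few iterations on synthetic data split across 4 simulated workers."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(4000, 5))
    y = X @ rng.normal(size=5) + 0.01 * rng.normal(size=4000)
    shards = [(X[i::4], y[i::4]) for i in range(4)]
    lam = 1e-3
    w = np.zeros(5)
    for _ in range(10):
        w = giant_step(w, shards, lam)
    # Reference: the exact ridge solution computed on the full data set.
    w_exact = np.linalg.solve(X.T @ X / 4000 + lam * np.eye(5), X.T @ y / 4000)
    return w, w_exact
```

Each iteration uses two communication rounds, one for the gradient and one for the directions, and every message is a vector of the same size as `w`; no Hessian matrix is ever sent to the driver.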

artificial intelligence, name change, proceedings, (4 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning (0.41)

GIANT: Globally Improved Approximate Newton Method for Distributed Optimization

Neural Information Processing Systems, Nov-20-2025, 23:06:08 GMT

For distributed computing environment, we consider the empirical risk minimization problem and propose a distributed and communication-efficient Newton-type optimization method. At every iteration, each worker locally finds an Approximate NewTon (ANT) direction, which is sent to the main driver. The main driver, then, averages all the ANT directions received from workers to form a Globally Improved ANT (GIANT) direction. GIANT is highly communication efficient and naturally exploits the trade-offs between local computations and global communications in that more local computations result in fewer overall rounds of communications. Theoretically, we show that GIANT enjoys an improved convergence rate as compared with first-order methods and existing distributed Newton-type methods. Further, and in sharp contrast with many existing distributed Newton-type methods, as well as popular first-order methods, a highly advantageous practical feature of GIANT is that it only involves one tuning parameter. We conduct large-scale experiments on a computer cluster and, empirically, demonstrate the superior performance of GIANT.

globally improved approximate newton method, name change, optimization, (4 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning (0.41)

Safe and Sparse Newton Method for Entropic-Regularized Optimal Transport

Neural Information Processing Systems, May-27-2025, 20:27:33 GMT

Computational optimal transport (OT) has received massive interests in the machine learning community, and great advances have been gained in the direction of entropic-regularized OT. The Sinkhorn algorithm, as well as its many improved versions, has become the de facto solution to large-scale OT problems. However, most of the existing methods behave like first-order methods, which typically require a large number of iterations to converge. More recently, Newton-type methods using sparsified Hessian matrices have demonstrated promising results on OT computation, but there still remain a lot of unresolved open questions. In this article, we make major new progresses towards this direction: first, we propose a novel Hessian sparsification scheme that promises a strict control of the approximation error; second, based on this sparsification scheme, we develop a safe Newton-type method that is guaranteed to avoid singularity in computing the search directions; third, the developed algorithm has a clear implementation for practical use, avoiding most hyperparameter tuning; and remarkably, we provide rigorous global and local convergence analysis of the proposed algorithm, which is lacking in the prior literature.

algorithm, entropic-regularized optimal transport, safe and sparse newton method, (3 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.81)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.40)

Reviews: DINGO: Distributed Newton-Type Method for Gradient-Norm Optimization

Neural Information Processing Systems, Jan-25-2025, 21:30:23 GMT

In this paper, the authors propose a distributed Newton method for gradient-norm optimization. The method does not impose any specific form on the underlying objective function. The authors present convergence analysis for the method and illustrate the performance of the method on a convex problem (in the main paper). Originality: The topic of the paper, in my opinion, is very interesting. The paper presents an efficient Newton method that is motivated via the optimization of the norm of the gradient.

dingo, gradient-norm optimization, newton-type method, (12 more...)

Neural Information Processing Systems

Genre:

Summary/Review (0.37)
Personal > Opinion (0.37)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning (0.59)

Proximal Newton-type methods for convex optimization

Neural Information Processing Systems, Mar-14-2024, 02:14:15 GMT

R is a convex but not necessarily differentiable function whose proximal mapping can be evaluated efficiently. We derive a generalization of Newton-type methods to handle such convex but nonsmooth objective functions. We prove such methods are globally convergent and achieve superlinear rates of convergence in the vicinity of an optimal solution. We also demonstrate the performance of these methods using problems of relevance in machine learning and statistics.
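To make the role of the proximal mapping concrete, here is a minimal sketch under strong simplifying assumptions that are not taken from the paper: R is the scaled L1 norm, and the Hessian of the smooth part is replaced by a positive diagonal, so the scaled proximal step has a closed form (coordinate-wise soft-thresholding). The function names `soft_threshold` and `prox_newton_step_diag` are hypothetical.

```python
import numpy as np

def soft_threshold(z, t):
    """Proximal mapping of t * |.| applied coordinate-wise."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def prox_newton_step_diag(w, grad, hess_diag, lam):
    """Minimize grad.(x - w) + 0.5 (x - w)' D (x - w) + lam * ||x||_1 over x.

    D = diag(hess_diag) is assumed positive. With a diagonal D the problem
    separates per coordinate: a Newton step followed by soft-thresholding
    with a per-coordinate threshold lam / d_i.
    """
    return soft_threshold(w - grad / hess_diag, lam / hess_diag)

def demo():
    # Smooth part: 0.5 * sum(d * (x - b)**2), whose Hessian is exactly diag(d).
    d = np.array([1.0, 2.0, 4.0])
    b = np.array([3.0, -0.1, 0.5])
    w = np.zeros(3)
    grad = d * (w - b)
    return prox_newton_step_diag(w, grad, d, lam=1.0)
```

Because the smooth part in `demo` is quadratic with an exactly diagonal Hessian, a single step lands on the minimizer of the composite objective; with a general Hessian the scaled proximal subproblem has no closed form and is usually solved iteratively.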

approximation, newton-type method, subproblem, (14 more...)

Neural Information Processing Systems

Country:

Asia > Japan > Honshū > Tōhoku > Fukushima Prefecture > Fukushima (0.05)
North America > United States > California > Santa Clara County > Palo Alto (0.05)
North America > United States > California > Santa Clara County > Stanford (0.04)
(8 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.70)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.67)

Lifted contact dynamics for efficient optimal control of rigid body systems with contacts

Katayama, Sotaro, Ohtsuka, Toshiyuki

arXiv.org Artificial Intelligence, Oct-24-2022

We propose a novel and efficient lifting approach for the optimal control of rigid-body systems with contacts to improve the convergence properties of Newton-type methods. To relax the high nonlinearity, we consider the state, acceleration, contact forces, and control input torques, as optimization variables and the inverse dynamics and acceleration constraints on the contact frames as equality constraints. We eliminate the update of the acceleration, contact forces, and their dual variables from the linear equation to be solved in each Newton-type iteration in an efficient manner. As a result, the computational cost per Newton-type iteration is almost identical to that of the conventional non-lifted Newton-type iteration that embeds contact dynamics in the state equation. We conducted numerical experiments on the whole-body optimal control of various quadrupedal gaits subject to the friction cone constraints considered in interior-point methods and demonstrated that the proposed method can significantly increase the convergence speed to more than twice that of the conventional non-lifted approach.

artificial intelligence, constraint, optimization problem, (17 more...)

arXiv.org Artificial Intelligence

2108.01781

Country: Asia > Japan > Honshū > Kansai > Kyoto Prefecture > Kyoto (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Robots (0.95)
Information Technology > Control Systems (0.92)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.67)

Newton-type Methods for Minimax Optimization

Zhang, Guojun, Wu, Kaiwen, Poupart, Pascal, Yu, Yaoliang

arXiv.org Machine Learning, Jun-25-2020

To account for the sequential and nonconvex nature, new solution concepts and algorithms have been developed. In this work, we provide a detailed analysis of existing algorithms and relate them to two novel Newton-type algorithms. We argue that our Newton-type algorithms nicely complement existing ones in that (a) they converge faster to (strict) local minimax points; (b) they are much more effective when the problem is ill-conditioned; (c) their computational complexity remains similar. We verify our theoretical results by conducting experiments on training GANs.

algorithm, gdn, local minimax point, (14 more...)

arXiv.org Machine Learning

2006.14592

Country:

North America > Canada > Ontario (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Russia (0.04)
(2 more...)

Genre: Research Report > New Finding (0.46)

Industry: Government (0.67)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.73)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)

GIANT: Globally Improved Approximate Newton Method for Distributed Optimization

Wang, Shusen, Roosta, Fred, Xu, Peng, Mahoney, Michael W.

Neural Information Processing Systems, Feb-14-2020, 10:26:19 GMT

For distributed computing environment, we consider the empirical risk minimization problem and propose a distributed and communication-efficient Newton-type optimization method. At every iteration, each worker locally finds an Approximate NewTon (ANT) direction, which is sent to the main driver. The main driver, then, averages all the ANT directions received from workers to form a Globally Improved ANT (GIANT) direction. GIANT is highly communication efficient and naturally exploits the trade-offs between local computations and global communications in that more local computations result in fewer overall rounds of communications. Theoretically, we show that GIANT enjoys an improved convergence rate as compared with first-order methods and existing distributed Newton-type methods.

globally improved approximate newton method, newton-type method, optimization, (2 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.40)

Newton-ADMM: A Distributed GPU-Accelerated Optimizer for Multiclass Classification Problems

Fang, Chih-Hao, Kylasa, Sudhir B, Roosta, Fred, Mahoney, Michael W., Grama, Ananth

arXiv.org Machine Learning, Feb-4-2020

First-order optimization methods, such as stochastic gradient descent (SGD) and its variants, are widely used in machine learning applications due to their simplicity and low per-iteration costs. However, they often require larger numbers of iterations, with associated communication costs in distributed environments. In contrast, Newton-type methods, while having higher per-iteration costs, typically require a significantly smaller number of iterations, which directly translates to reduced communication costs. In this paper, we present a novel distributed optimizer for classification problems, which integrates a GPU-accelerated Newton-type solver with the global consensus formulation of Alternating Direction Method of Multipliers (ADMM). By leveraging the communication efficiency of ADMM, a GPU-accelerated inexact-Newton solver, and an effective spectral penalty parameter selection strategy, we show that our proposed method (i) yields better generalization performance on several classification problems; (ii) significantly outperforms state-of-the-art methods in distributed time to solution; and (iii) offers better scaling on large distributed platforms.

dataset, newton-admm, solver, (15 more...)

arXiv.org Machine Learning

1807.07132

Country: